Climate change#
In the following sections, we use Python to demonstrate how to access multiple datasets from the Climate change sub-catalog.
Environment setup#
[1]:
# Parallel computing
import dask
from dask.diagnostics import progress
from dask.distributed import Client, PipInstall

# Data access and handling
import intake
import fsspec
import xarray as xr
import xoak
import numpy as np
import pandas as pd

# Plotting
import hvplot.xarray
import hvplot.pandas
from matplotlib import pyplot as plt
import seaborn as sns

from tqdm.autonotebook import tqdm

%matplotlib inline
%config InlineBackend.figure_format = 'retina'
We use a Dask client so that all subsequent code that supports the framework runs in parallel.
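By default, Client() starts a local cluster sized to the machine. If you need explicit control, the cluster can be configured directly; in the sketch below the worker count and memory limit are illustrative only, mirroring the cluster shown in the output:

from dask.distributed import Client, LocalCluster

# Illustrative explicit configuration; adjust to your machine
cluster = LocalCluster(n_workers=2, threads_per_worker=1, memory_limit='3.4GiB')
client = Client(cluster)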
[2]:
client = Client()
client
[2]:
Client: Client-6993549e-6a74-11ed-8d53-000d3aee4b33
Connection method: Cluster object (distributed.LocalCluster)
Dashboard: http://127.0.0.1:8787/status
LocalCluster e435adea: 2 workers, 2 total threads, 6.78 GiB total memory, status: running
Accessing the data#
We are now ready to access our catalog, which uses Intake-ESM to organize all our datasets.
Intake is a lightweight package for finding, investigating, loading and disseminating data. A cataloging system organizes a collection of datasets, and data loaders (drivers) are parameterized so that each dataset opens in the format the end user wants. In the Python context, multi-dimensional arrays can be opened with xarray's drivers, while vector data (shapefiles, GeoJSON) can be opened with geopandas.
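As a minimal sketch of that driver idea (the catalog file and entry names below are hypothetical, for illustration only):

import intake

# Hypothetical catalog with entries backed by different drivers
cat = intake.open_catalog('catalog.yml')
ds = cat.gridded_temperature.to_dask()  # array entry -> xarray Dataset (intake-xarray driver)
gdf = cat.watershed_polygons.read()     # vector entry -> GeoDataFrame (intake-geopandas driver)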
Each catalog below is opened directly from its URL:
a) CMIP6#
To organize the collection of datasets, the catalog itself references various sub-datasets:
[3]:
col = intake.open_esm_datastore('https://storage.googleapis.com/cmip6/pangeo-cmip6.json')
col
pangeo-cmip6 catalog with 7674 dataset(s) from 514818 asset(s):
| | unique |
|---|---|
| activity_id | 18 |
| institution_id | 36 |
| source_id | 88 |
| experiment_id | 170 |
| member_id | 657 |
| table_id | 37 |
| variable_id | 700 |
| grid_label | 10 |
| zstore | 514818 |
| dcpp_init_year | 60 |
| version | 736 |
| derived_variable_id | 0 |
[4]:
col.df['table_id'].unique()
[4]:
array(['Amon', '6hrPlev', '3hr', 'day', 'EmonZ', 'E3hr', '6hrPlevPt',
'AERmon', 'LImon', 'CFmon', 'Lmon', 'fx', 'SImon', 'Ofx', 'Omon',
'EdayZ', 'Emon', 'CFday', 'AERday', 'Eday', 'Oyr', 'Eyr', 'Oday',
'SIday', 'AERmonZ', '6hrLev', 'E1hrClimMon', 'CF3hr', 'AERhr',
'Odec', 'Oclim', 'Efx', 'Aclim', 'SIclim', 'IfxGre', 'ImonGre',
'Eclim'], dtype=object)
[5]:
from itables import init_notebook_mode, show
from IPython.display import HTML
init_notebook_mode(all_interactive=False)
show(col.df,
tags="<caption>Catalog</caption>",
column_filters="footer",
dom="lrtip")
[Interactive table of the full catalog: activity_id, institution_id, source_id, experiment_id, member_id, table_id, variable_id, grid_label, zstore, dcpp_init_year, version]
[6]:
# load a few models to illustrate the workflow
query = dict(
    experiment_id=["ssp585"],
    variable_id="tasmax",
    grid_label="gn",
    table_id='Amon',
    member_id='r1i1p1f1',
)
cat = col.search(**query)

xarray_kwargs = {'consolidated': True, 'decode_times': False}
with dask.config.set(**{'array.slicing.split_large_chunks': True}):
    dset_dict = cat.to_dataset_dict(xarray_open_kwargs=xarray_kwargs)
--> The keys in the returned dictionary of datasets are constructed as follows:
'activity_id.institution_id.source_id.experiment_id.table_id.grid_label'
[7]:
list(dset_dict.keys())
[7]:
['ScenarioMIP.FIO-QLNM.FIO-ESM-2-0.ssp585.Amon.gn',
'ScenarioMIP.BCC.BCC-CSM2-MR.ssp585.Amon.gn',
'ScenarioMIP.CCCma.CanESM5.ssp585.Amon.gn',
'ScenarioMIP.CSIRO-ARCCSS.ACCESS-CM2.ssp585.Amon.gn',
'ScenarioMIP.MIROC.MIROC6.ssp585.Amon.gn',
'ScenarioMIP.DKRZ.MPI-ESM1-2-HR.ssp585.Amon.gn',
'ScenarioMIP.MRI.MRI-ESM2-0.ssp585.Amon.gn',
'ScenarioMIP.NCAR.CESM2-WACCM.ssp585.Amon.gn',
'ScenarioMIP.AWI.AWI-CM-1-1-MR.ssp585.Amon.gn',
'ScenarioMIP.MPI-M.MPI-ESM1-2-LR.ssp585.Amon.gn',
'ScenarioMIP.CAS.FGOALS-g3.ssp585.Amon.gn',
'ScenarioMIP.NUIST.NESM3.ssp585.Amon.gn']
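Any dataset can also be pulled out of the dictionary by its full key, for example:

# select a specific model directly by its key (one of the keys listed above)
ds_canesm5 = dset_dict['ScenarioMIP.CCCma.CanESM5.ssp585.Amon.gn']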
[8]:
ds = dset_dict[list(dset_dict.keys())[0]]
ds
[8]:
<xarray.Dataset>
Dimensions: (lat: 192, bnds: 2, lon: 288, member_id: 1,
dcpp_init_year: 1, time: 1032)
Coordinates:
height float64 ...
* lat (lat) float64 -90.0 -89.06 -88.12 ... 88.12 89.06 90.0
lat_bnds (lat, bnds) float64 dask.array<chunksize=(192, 2), meta=np.ndarray>
* lon (lon) float64 0.0 1.25 2.5 3.75 ... 355.0 356.2 357.5 358.8
lon_bnds (lon, bnds) float64 dask.array<chunksize=(288, 2), meta=np.ndarray>
* time (time) int64 0 708 1416 2148 ... 750420 751152 751884 752616
time_bnds (time, bnds) float64 dask.array<chunksize=(1032, 2), meta=np.ndarray>
* member_id (member_id) object 'r1i1p1f1'
* dcpp_init_year (dcpp_init_year) float64 nan
Dimensions without coordinates: bnds
Data variables:
tasmax (member_id, dcpp_init_year, time, lat, lon) float32 dask.array<chunksize=(1, 1, 298, 192, 288), meta=np.ndarray>
Attributes: (12/62)
Conventions: CF-1.7 CMIP-6.2
activity_id: ScenarioMIP
branch_method: standard
branch_time_in_child: 0.0
branch_time_in_parent: 60225.0
cmor_version: 3.4.0
... ...
intake_esm_attrs:variable_id: tasmax
intake_esm_attrs:grid_label: gn
intake_esm_attrs:zstore: gs://cmip6/CMIP6/ScenarioMIP/FIO-QLNM/F...
intake_esm_attrs:version: 20200922
intake_esm_attrs:_data_format_: zarr
    intake_esm_dataset_key:           ScenarioMIP.FIO-QLNM.FIO-ESM-2-0.ssp585...
[9]:
fig, axarr = plt.subplots(nrows=4, ncols=3, figsize=[30, 20])
for ax, (k, ds) in zip(axarr.flat, dset_dict.items()):
    if 'member_id' in ds.dims:
        ds = ds.isel(member_id=0)
    # recenter longitudes from [0, 360) to [-180, 180) and sort
    ds.coords['lon'] = (ds.coords['lon'] + 180) % 360 - 180
    ds = ds.sortby(ds.lon)
    ds.tasmax.isel(time=0).squeeze().plot.pcolormesh(ax=ax, cmap='coolwarm')
    ax.set_title(k)
[10]:
fig, axarr = plt.subplots(nrows=4, ncols=3, figsize=[30, 20])
for ax, (k, ds) in zip(axarr.flat, dset_dict.items()):
    if 'member_id' in ds.dims:
        ds = ds.isel(member_id=0)
    # time series at the grid point nearest 45N, 80W (longitude expressed as 280E)
    ds.tasmax.sel(lon=280, lat=45, method='nearest').squeeze().plot(ax=ax, color='blue')
    ax.set_title(k)
[11]:
[eid for eid in col.df['experiment_id'].unique() if 'ssp' in eid]
[11]:
['ssp585',
'ssp245',
'ssp370SST-lowCH4',
'ssp370-lowNTCF',
'ssp370SST-lowNTCF',
'ssp370SST-ssp126Lu',
'ssp370SST',
'ssp370pdSST',
'ssp119',
'ssp370',
'esm-ssp585-ssp126Lu',
'ssp126-ssp370Lu',
'ssp370-ssp126Lu',
'ssp126',
'esm-ssp585',
'ssp245-GHG',
'ssp245-nat',
'ssp460',
'ssp434',
'ssp534-over',
'ssp245-stratO3',
'ssp245-aer',
'ssp245-cov-modgreen',
'ssp245-cov-fossil',
'ssp245-cov-strgreen',
'ssp245-covid',
'ssp585-bgc']
[12]:
# there is currently a significant amount of data for these runs
expts = ['historical', 'ssp245', 'ssp585']

query = dict(
    experiment_id=expts,
    table_id='Amon',
    variable_id=['tas'],
    member_id='r1i1p1f1',
)

# require_all_on keeps only models that provide all three experiments
col_subset = col.search(require_all_on=["source_id"], **query)

col_subset.df.groupby("source_id")[
    ["experiment_id", "variable_id", "table_id"]
].nunique()
[12]:
| source_id | experiment_id | variable_id | table_id |
|---|---|---|---|
| ACCESS-CM2 | 3 | 1 | 1 |
| AWI-CM-1-1-MR | 3 | 1 | 1 |
| BCC-CSM2-MR | 3 | 1 | 1 |
| CAMS-CSM1-0 | 3 | 1 | 1 |
| CAS-ESM2-0 | 3 | 1 | 1 |
| CESM2-WACCM | 3 | 1 | 1 |
| CIESM | 3 | 1 | 1 |
| CMCC-CM2-SR5 | 3 | 1 | 1 |
| CMCC-ESM2 | 3 | 1 | 1 |
| CanESM5 | 3 | 1 | 1 |
| E3SM-1-1 | 3 | 1 | 1 |
| EC-Earth3 | 3 | 1 | 1 |
| EC-Earth3-CC | 3 | 1 | 1 |
| EC-Earth3-Veg | 3 | 1 | 1 |
| EC-Earth3-Veg-LR | 3 | 1 | 1 |
| FGOALS-f3-L | 3 | 1 | 1 |
| FGOALS-g3 | 3 | 1 | 1 |
| FIO-ESM-2-0 | 3 | 1 | 1 |
| GFDL-CM4 | 3 | 1 | 1 |
| GFDL-ESM4 | 3 | 1 | 1 |
| IITM-ESM | 3 | 1 | 1 |
| INM-CM4-8 | 3 | 1 | 1 |
| INM-CM5-0 | 3 | 1 | 1 |
| IPSL-CM6A-LR | 3 | 1 | 1 |
| KACE-1-0-G | 3 | 1 | 1 |
| KIOST-ESM | 3 | 1 | 1 |
| MIROC6 | 3 | 1 | 1 |
| MPI-ESM1-2-HR | 3 | 1 | 1 |
| MPI-ESM1-2-LR | 3 | 1 | 1 |
| MRI-ESM2-0 | 3 | 1 | 1 |
| NESM3 | 3 | 1 | 1 |
| NorESM2-LM | 3 | 1 | 1 |
| NorESM2-MM | 3 | 1 | 1 |
| TaiESM1 | 3 | 1 | 1 |
[13]:
def drop_all_bounds(ds):
    # remove coordinate bounds variables, which we don't need for means
    drop_vars = [vname for vname in ds.coords
                 if '_bounds' in vname or '_bnds' in vname]
    return ds.drop_vars(drop_vars)

def open_dset(df):
    # each catalog row corresponds to exactly one zarr store
    assert len(df) == 1
    ds = xr.open_zarr(fsspec.get_mapper(df.zstore.values[0]), consolidated=True)
    return drop_all_bounds(ds)

def open_delayed(df):
    # defer opening so all stores can be opened in parallel later
    return dask.delayed(open_dset)(df)

from collections import defaultdict
dsets = defaultdict(dict)
for group, df in col_subset.df.groupby(by=['source_id', 'experiment_id']):
    dsets[group[0]][group[1]] = open_delayed(df)
[14]:
# trigger all the delayed opens in parallel
dsets_ = dask.compute(dict(dsets))[0]
[15]:
# calculate global means
def get_lat_name(ds):
    # models name the latitude coordinate differently
    for lat_name in ['lat', 'latitude']:
        if lat_name in ds.coords:
            return lat_name
    raise RuntimeError("Couldn't find a latitude coordinate")

def global_mean(ds):
    # area-weight by cos(latitude) before averaging over space
    lat = ds[get_lat_name(ds)]
    weight = np.cos(np.deg2rad(lat))
    weight /= weight.mean()
    other_dims = set(ds.dims) - {'time'}
    return (ds * weight).mean(other_dims)
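On a regular lat/lon grid, cells shrink toward the poles, so an unweighted mean over-counts high latitudes; the cosine weighting corrects for this. A quick self-contained sanity check on synthetic data (a sketch, not part of the catalog workflow; it reuses the numpy and xarray imports from above):

# Synthetic field that is warmest at the equator: value = cos(lat)
lat = np.linspace(-89.5, 89.5, 180)
lon = np.linspace(0.5, 359.5, 360)
field = xr.DataArray(
    np.cos(np.deg2rad(lat))[:, None] * np.ones(len(lon)),
    coords={'lat': lat, 'lon': lon}, dims=['lat', 'lon'],
)
weight = np.cos(np.deg2rad(field.lat))
weight /= weight.mean()
print(float(field.mean()))             # unweighted mean, ~0.64: poles over-counted
print(float((field * weight).mean()))  # area-weighted mean, ~0.79 (= pi/4)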
[16]:
expt_da = xr.DataArray(expts, dims='experiment_id', name='experiment_id',
                       coords={'experiment_id': expts})

dsets_aligned = {}
for k, v in tqdm(dsets_.items()):
    expt_dsets = v.values()
    if any(d is None for d in expt_dsets):
        print(f"Missing experiment for {k}")
        continue
    for ds in expt_dsets:
        ds.coords['year'] = ds.time.dt.year
    # workaround for
    # https://github.com/pydata/xarray/issues/2237#issuecomment-620961663
    dsets_ann_mean = [v[expt].pipe(global_mean)
                             .swap_dims({'time': 'year'})
                             .drop_vars('time')
                             .coarsen(year=12).mean()
                      for expt in expts]
    # concatenate the experiments along a new experiment_id dimension
    dsets_aligned[k] = xr.concat(dsets_ann_mean, join='outer', dim=expt_da)
100%|██████████| 34/34 [00:14<00:00, 2.34it/s]
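The coarsen(year=12).mean() step converts the monthly series into annual means by averaging non-overlapping blocks of 12 values. A minimal sketch of the mechanic on synthetic data (illustrative only):

# 24 "monthly" values -> 2 annual means via non-overlapping block averaging
monthly = xr.DataArray(np.arange(24.0), dims='year')
annual = monthly.coarsen(year=12).mean()
print(annual.values)  # [ 5.5 17.5]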
[17]:
with progress.ProgressBar():
    dsets_aligned_ = dask.compute(dsets_aligned)[0]
Using xarray, we can quickly combine the per-model results. Below, we concatenate the aligned annual means along a new source_id dimension, convert the 1900–2100 period to a dataframe, and plot the global-mean temperature for each scenario with seaborn.
[18]:
source_ids = list(dsets_aligned_.keys())
source_da = xr.DataArray(source_ids, dims='source_id', name='source_id',
                         coords={'source_id': source_ids})

big_ds = xr.concat([ds.reset_coords(drop=True)
                    for ds in dsets_aligned_.values()],
                   dim=source_da)
big_ds
[18]:
<xarray.Dataset>
Dimensions: (year: 451, experiment_id: 3, source_id: 34)
Coordinates:
* year (year) float64 1.85e+03 1.851e+03 ... 2.299e+03 2.3e+03
* experiment_id (experiment_id) <U10 'historical' 'ssp245' 'ssp585'
* source_id (source_id) <U16 'ACCESS-CM2' 'AWI-CM-1-1-MR' ... 'TaiESM1'
Data variables:
    tas            (source_id, experiment_id, year) float64 287.0 287.0 ... nan
[19]:
df_all = big_ds.sel(year=slice(1900, 2100)).to_dataframe().reset_index()
df_all.head()
[19]:
| year | experiment_id | source_id | tas | |
|---|---|---|---|---|
| 0 | 1900.0 | historical | ACCESS-CM2 | 287.019917 |
| 1 | 1900.0 | historical | AWI-CM-1-1-MR | 286.958154 |
| 2 | 1900.0 | historical | BCC-CSM2-MR | 287.996260 |
| 3 | 1900.0 | historical | CAMS-CSM1-0 | 287.084974 |
| 4 | 1900.0 | historical | CAS-ESM2-0 | 287.263682 |
[20]:
sns.relplot(data=df_all,
            x="year", y="tas", hue='experiment_id',
            kind="line", errorbar='sd', aspect=2);
b) NA-CORDEX#
We now repeat the same pattern with the NA-CORDEX regional-climate catalog hosted on AWS:
[21]:
col = intake.open_esm_datastore('https://ncar-na-cordex.s3-us-west-2.amazonaws.com/catalogs/aws-na-cordex.json')
col
aws-na-cordex catalog with 330 dataset(s) from 330 asset(s):
| | unique |
|---|---|
| variable | 15 |
| standard_name | 10 |
| long_name | 18 |
| units | 10 |
| spatial_domain | 1 |
| grid | 2 |
| spatial_resolution | 2 |
| scenario | 6 |
| start_time | 3 |
| end_time | 4 |
| frequency | 1 |
| vertical_levels | 1 |
| bias_correction | 3 |
| na-cordex-models | 9 |
| path | 330 |
| derived_variable | 0 |
[22]:
# Show the first few lines of the catalog
show(col.df,
tags="<caption>Catalog</caption>",
column_filters="footer",
dom="lrtip")
[Interactive table of the full catalog: variable, standard_name, long_name, units, spatial_domain, grid, spatial_resolution, scenario, start_time, end_time, frequency, vertical_levels, bias_correction, na-cordex-models, path]
[23]:
data_var = 'tmax'
col_subset = col.search(
    variable=data_var,
    grid="NAM-44i",
    bias_correction="raw",
    scenario='rcp45',
)
col_subset
aws-na-cordex catalog with 1 dataset(s) from 1 asset(s):
| | unique |
|---|---|
| variable | 1 |
| standard_name | 1 |
| long_name | 1 |
| units | 1 |
| spatial_domain | 1 |
| grid | 1 |
| spatial_resolution | 1 |
| scenario | 1 |
| start_time | 1 |
| end_time | 1 |
| frequency | 1 |
| vertical_levels | 1 |
| bias_correction | 1 |
| na-cordex-models | 1 |
| path | 1 |
| derived_variable | 0 |
[24]:
col_subset.df
[24]:
| variable | standard_name | long_name | units | spatial_domain | grid | spatial_resolution | scenario | start_time | end_time | frequency | vertical_levels | bias_correction | na-cordex-models | path | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | tmax | air_temperature | Daily Maximum Near-Surface Air Temperature | degC | north_america | NAM-44i | 0.50 deg | rcp45 | 2006-01-01T12:00:00 | 2100-12-31T12:00:00 | day | 1 | raw | ['MPI-ESM-LR.CRCM5-UQAM', 'CanESM2.CRCM5-UQAM'... | s3://ncar-na-cordex/day/tmax.rcp45.day.NAM-44i... |
[25]:
# Load catalog entries for subset into a dictionary of xarray datasets, and open the first one.
dsets = col_subset.to_dataset_dict(
    xarray_open_kwargs={"consolidated": True}, storage_options={"anon": True}
)
print(f"\nDataset dictionary keys:\n {dsets.keys()}")

# Load the first dataset and display a summary.
dataset_key = list(dsets.keys())[0]
store_name = dataset_key + ".zarr"
ds = dsets[dataset_key]
ds

# Note that the summary includes a 'member_id' coordinate, which is a renaming of the
# 'na-cordex-models' column in the catalog.
--> The keys in the returned dictionary of datasets are constructed as follows:
'variable.frequency.scenario.grid.bias_correction'
Dataset dictionary keys:
dict_keys(['tmax.day.rcp45.NAM-44i.raw'])
[25]:
<xarray.Dataset>
Dimensions: (lat: 129, lon: 300, member_id: 6, time: 34698, bnds: 2)
Coordinates:
* lat (lat) float64 12.25 12.75 13.25 13.75 ... 74.75 75.25 75.75 76.25
* lon (lon) float64 -171.8 -171.2 -170.8 ... -23.25 -22.75 -22.25
* member_id (member_id) <U21 'MPI-ESM-LR.CRCM5-UQAM' ... 'CanESM2.CanRCM4'
* time (time) datetime64[ns] 2006-01-01T12:00:00 ... 2100-12-31T12:00:00
time_bnds (time, bnds) datetime64[ns] dask.array<chunksize=(17349, 2), meta=np.ndarray>
Dimensions without coordinates: bnds
Data variables:
tmax (member_id, time, lat, lon) float32 dask.array<chunksize=(4, 1000, 65, 150), meta=np.ndarray>
Attributes: (12/41)
CORDEX_domain: NAM-44
contact: {"MPI-ESM-LR.CRCM5-UQAM": "Winger.K...
creation_date: {"MPI-ESM-LR.CRCM5-UQAM": "2012-09-...
driving_experiment: {"MPI-ESM-LR.CRCM5-UQAM": "MPI-M-MP...
driving_experiment_name: rcp45
driving_model_ensemble_member: {"MPI-ESM-LR.CRCM5-UQAM": "r1i1p1",...
... ...
intake_esm_attrs:vertical_levels: 1
intake_esm_attrs:bias_correction: raw
intake_esm_attrs:na-cordex-models: ['MPI-ESM-LR.CRCM5-UQAM', 'CanESM2....
intake_esm_attrs:path: s3://ncar-na-cordex/day/tmax.rcp45....
intake_esm_attrs:_data_format_: zarr
    intake_esm_dataset_key:           tmax.day.rcp45.NAM-44i.raw
[26]:
# interactive time series at the grid point nearest 45N, 75W, one curve per member
ds.tmax \
  .sel(lat=45, lon=-75, method='nearest') \
  .hvplot(x='time', by='member_id', width=750, height=500, grid=True) \
  .opts(legend_position='bottom')
[26]:
[Interactive hvplot time series: tmax at the selected point, one curve per member_id]